
    Survey of scientific programming techniques for the management of data-intensive engineering environments

    The present paper introduces and reviews existing technology and research in the field of scientific programming methods and techniques for data-intensive engineering environments. More specifically, this survey collects the relevant approaches that have faced the challenge of delivering more advanced and intelligent methods by taking advantage of existing large datasets. Although existing tools and techniques have demonstrated their ability to manage complex engineering processes for the development and operation of safety-critical systems, there is an emerging need to know how existing computational science methods will behave when managing large amounts of data. The authors therefore review open issues in the context of engineering, with special focus on scientific programming techniques and hybrid approaches. A total of 1,193 journal papers were identified as representative of these areas; 935 were screened, and 122 received a full review. Afterwards, a comprehensive mapping between techniques and engineering and non-engineering domains was conducted to classify and perform a meta-analysis of the current state of the art. As the main result of this work, a set of 10 challenges for future data-intensive engineering environments is outlined. The current work has been partially supported by the Research Agreement between RTVE (the Spanish Radio and Television Corporation) and the UC3M to boost research in the fields of Big Data, Linked Data, Complex Network Analysis, and Natural Language. It has also received the support of the Tecnológico Nacional de México (TECNM), the National Council of Science and Technology (CONACYT), and the Public Education Secretary (SEP) through PRODEP

    Comparative Analysis of Decision Tree Algorithms for Data Warehouse Fragmentation

    One of the main problems faced by Data Warehouse designers is fragmentation. Several studies have proposed data mining-based horizontal fragmentation methods; however, no existing horizontal fragmentation technique uses a decision tree. This paper presents an analysis of different decision tree algorithms in order to select the best one for implementing the fragmentation method. The analysis was performed with Weka 3.9.4, considering four evaluation metrics (Precision, ROC Area, Recall, and F-measure) on data sets selected from the Star Schema Benchmark. The results showed that the two best algorithms were J48 and Random Forest in most cases; nevertheless, J48 was selected because it is more efficient at building the model
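    The comparison above ranks classifiers on Precision, Recall, and F-measure. As a minimal sketch of how such metrics are computed, the snippet below scores two classifiers on invented predictions; the labels are placeholders, not results from the Star Schema Benchmark experiments.

```python
# Sketch: scoring two classifiers on the metrics used in the paper
# (Precision, Recall, F-measure). Predictions are invented placeholders.

def precision_recall_f1(y_true, y_pred, positive=1):
    """Binary precision, recall and F-measure for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical ground truth and predictions from two candidate algorithms.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
tree_a = [1, 1, 0, 0, 0, 1, 0, 1]   # e.g. a J48-style tree
tree_b = [1, 1, 1, 0, 1, 1, 0, 0]   # e.g. a random-forest vote

for name, pred in [("A", tree_a), ("B", tree_b)]:
    p, r, f = precision_recall_f1(y_true, pred)
    print(f"classifier {name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In the actual study these metrics were obtained from Weka rather than computed by hand; the sketch only shows what the numbers mean.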

    Improving N calculation of the RSI financial indicator using neural networks

    Proceeding of: 2010 2nd IEEE International Conference on Information and Financial Engineering (ICIFE 2010), 17-19 September 2010, Chongqing, China. Trading and stock behavioral analysis systems require efficient Artificial Intelligence techniques for analyzing large financial datasets and, in the current economic landscape, have become a significant challenge for multidisciplinary research. In particular, trading-oriented Decision Support Systems based on the chartist or technical analysis Relative Strength Indicator (RSI) have been published and used worldwide. However, their combination with Neural Networks, which can outperform previous results, remains a relevant approach that has not received enough attention. In this paper, we present the Chartist Analysis Platform for Trading (CAST), a proof-of-concept architecture and implementation of a trading Decision Support System based on the calculation of the RSI's N value and Feed Forward Neural Networks (FFNN). CAST provides a set of relatively more accurate financial decisions yielded by applying Artificial Intelligence techniques to the N calculation for RSI, together with a more precise and improved outcome obtained from applying feed-forward algorithms to stock value datasets. This work is supported by the Spanish Ministry of Industry, Tourism, and Commerce under the projects GODO2 (TSI-020100-2008-564) and SONAR2 (TSI-020100-2008-665), by the PIBES project of the Spanish Committee of Education & Science (TEC2006-12365-C02-01), and by the MID-CBR project of the Spanish Committee of Education & Science (TIN2006-15140-C03-02). Furthermore, this work is supported by the General Council of Superior Technological Education of Mexico (DGEST), and sponsored by the National Council of Science and Technology (CONACYT) and the Public Education Secretary (SEP) through PROMEP
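    The indicator whose N parameter CAST tunes is the RSI. As a minimal sketch, the snippet below implements the textbook simple-average variant of RSI over a window of N price changes; the price series is illustrative, not real stock data, and the paper's neural-network tuning of N is not shown.

```python
# Textbook simple-average Relative Strength Index (RSI) over n periods.
# Prices below are invented for illustration.

def rsi(prices, n=14):
    """RSI = 100 - 100 / (1 + RS), RS = average gain / average loss
    over the last n price changes (simple averages)."""
    if len(prices) < n + 1:
        raise ValueError("need at least n + 1 prices")
    changes = [b - a for a, b in zip(prices, prices[1:])][-n:]
    avg_gain = sum(c for c in changes if c > 0) / n
    avg_loss = sum(-c for c in changes if c < 0) / n
    if avg_loss == 0:
        return 100.0          # no losses in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)

prices = [44, 45, 46, 45, 47, 48, 47, 49, 50, 49, 51, 52, 51, 53, 54]
print(f"RSI(14) = {rsi(prices):.1f}")
```

The paper's contribution is choosing N per stock with a feed-forward network instead of fixing it at the conventional 14.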

    Search in the eye of the beholder: using the personal social dataset and ontology-guided input to improve web search efficiency

    Proceedings of: Latin American Web Conference 2007 (LA-WEB 2007), 31 October-2 November 2007, Santiago, Chile. Among the challenges of searching the vast information source the Web has become, improving Web search efficiency through different strategies that use semantics and user-generated data from Web 2.0 applications remains a promising and interesting approach. In this paper, we present the Personal Social Dataset and Ontology-guided Input strategies, couple them together, and provide a proof-of-concept implementation

    Towards Association Rule-based Item Selection Strategy in Computerized Adaptive Testing

    One of the most important stages of Computerized Adaptive Testing (CAT) is item selection, for which various methods are used, each with certain weaknesses at implementation time. Therefore, in this paper we propose integrating Association Rule Mining as an item selection criterion in a CAT system. We present an analysis of association rule mining algorithms such as Apriori, FP-Growth, PredictiveApriori, and Tertius on two data sets, with the purpose of understanding the advantages and disadvantages of each algorithm and choosing the most suitable one. We compare the algorithms considering the number of rules discovered, average support and confidence, and speed. According to the experiments, Apriori found rules with greater confidence and support in less time
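    As a minimal sketch of the Apriori idea the paper evaluates, the snippet below mines frequent itemsets from transactions and keeps rules above support and confidence thresholds. The transactions (sets of test items answered together) and thresholds are invented for illustration; this is not the paper's CAT implementation.

```python
# Plain-Python Apriori sketch: frequent itemsets + association rules.
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(transactions)
    candidates = list({frozenset([i]) for t in transactions for i in t})
    result, size = {}, 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {c: k / n for c, k in counts.items() if k / n >= min_support}
        result.update(frequent)
        size += 1
        # Apriori pruning: size-k candidates come only from frequent
        # (k-1)-itemsets, since every subset of a frequent set is frequent.
        keys = list(frequent)
        candidates = list({a | b for a, b in combinations(keys, 2)
                           if len(a | b) == size})
    return result

def rules(transactions, min_support, min_confidence):
    """Rules antecedent -> consequent above both thresholds."""
    freq = frequent_itemsets(transactions, min_support)
    out = []
    for itemset, support in freq.items():
        for r in range(1, len(itemset)):
            for ante in map(frozenset, combinations(itemset, r)):
                confidence = support / freq[ante]
                if confidence >= min_confidence:
                    out.append((set(ante), set(itemset - ante),
                                support, confidence))
    return out

# Hypothetical transactions: sets of test items answered correctly together.
transactions = [frozenset(t) for t in
                [{"q1", "q2", "q3"}, {"q1", "q2"}, {"q2", "q3"},
                 {"q1", "q2", "q3"}, {"q1", "q3"}]]
for ante, cons, supp, conf in rules(transactions, 0.4, 0.7):
    print(f"{sorted(ante)} -> {sorted(cons)} (supp={supp:.2f}, conf={conf:.2f})")
```

In a CAT system, a rule such as "answered q1 correctly -> likely answers q2 correctly" would steer which item is administered next.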

    Supply chain knowledge management: A linked data-based approach using SKOS

    Nowadays, knowledge is a powerful tool for obtaining benefits within organizations, especially as semantic web technologies are adapted to the requirements of enterprises. In this regard, the Simple Knowledge Organization System (SKOS) is an area of work developing specifications and standards to support the use of knowledge organization systems. In recent years, SKOS has become one of the sweet spots in linked data (LD) ecosystems. In this paper, we propose a linked data-based approach using SKOS to manage the knowledge from supply chains. Additionally, this paper covers how SKOS can be enriched by ontologies and LD to further improve semantic information management. This matters because the supply chain literature focuses on the assets, data, and information elements exchanged between supply chain partners, even though improved integration and collaboration require developing more complex features of know-how and knowledge
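    As a minimal sketch of how a supply-chain vocabulary might look in SKOS, the snippet below encodes a few concepts as plain (subject, predicate, object) triples using the standard `skos:prefLabel`, `skos:broader`, and `skos:related` properties. The concept names and the `example.org` namespace are invented; a real system would serialize such triples with an RDF library.

```python
# Hypothetical supply-chain concept scheme expressed as SKOS triples.
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/supply-chain#"   # invented namespace

triples = [
    (EX + "Logistics",   SKOS + "prefLabel", "Logistics"),
    (EX + "Warehousing", SKOS + "prefLabel", "Warehousing"),
    (EX + "Warehousing", SKOS + "broader",   EX + "Logistics"),
    (EX + "Transport",   SKOS + "prefLabel", "Transport"),
    (EX + "Transport",   SKOS + "broader",   EX + "Logistics"),
    (EX + "Transport",   SKOS + "related",   EX + "Warehousing"),
]

def narrower(concept):
    """Concepts whose skos:broader points at `concept`."""
    return [s for s, p, o in triples
            if p == SKOS + "broader" and o == concept]

print(narrower(EX + "Logistics"))
```

The broader/narrower hierarchy is what lets supply-chain partners navigate from shared high-level concepts down to each organization's specific terms.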

    A comparison between the Functional Analysis and the Causal-Loop Diagram to model inventive problems

    The pressure of the market, the exigencies of society, and environmental restrictions call for new problem-solving approaches. In this context, the Theory of Inventive Problem Solving (TRIZ) offers several advantages: it is a knowledge-based approach to problem-solving that links the problem requirements with engineering models to guide the solving process. However, the process of learning TRIZ and using it for practical purposes reveals many drawbacks. A significant problem while using TRIZ emerges when the user needs to analyze and formulate an inventive problem. To deal with this issue, combining TRIZ with other tools seems the best strategy, and the use of Functional Analysis (FA) is one of the best examples. Despite the usefulness of the FA technique, a difficulty remains: it is a complex task to model the causal relationships between several parameters or conditions within a system. However, a tool used in System Dynamics modeling deals well with this situation. System Dynamics (SD) analyzes the nonlinear behavior of complex systems over time. Congruent with recent TRIZ advances, SD is a computer-aided approach with an extended application domain: practically any complex system (social, managerial, economic, or natural) defined by some relationships, a flow of information, and some effects of causality. Hence, SD can produce useful information when there are several conflicts in a system, also called a problem network. SD uses a graphical tool to model the variables and states of a system: the Causal-Loop Diagram. This tool is helpful to explain a conflict, the change of a system, or merely the interactions that take place to obtain an effect. This article presents a comparison between Functional Analysis and the Causal-Loop Diagram for modeling inventive problems
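    A Causal-Loop Diagram can be sketched in code as directed links with a +/- polarity, where a closed loop is reinforcing if it contains an even number of negative links and balancing otherwise. The variables below are generic placeholders, not the article's case study.

```python
# Causal-Loop Diagram sketch: links carry a polarity, and the sign of a
# loop is the product of its link polarities (+1 reinforcing, -1 balancing).

# (source, target, polarity): +1 means "changes in the same direction",
# -1 means "changes in the opposite direction".
links = [
    ("demand", "production", +1),
    ("production", "inventory", +1),
    ("inventory", "price", -1),
    ("price", "demand", -1),
]

def loop_polarity(loop_links):
    """Sign of a closed chain of causal links."""
    sign = 1
    for _, _, polarity in loop_links:
        sign *= polarity
    return sign

kind = "reinforcing" if loop_polarity(links) > 0 else "balancing"
print(f"loop is {kind}")
```

This polarity arithmetic is what lets the diagram explain whether a conflict in the problem network amplifies itself or settles toward equilibrium.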

    A hybrid fragmentation method for multimedia databases

    Hybrid partitioning has been recognized as a technique to achieve query optimization in both relational and object-oriented databases. Due to the increasing availability of multimedia applications, there is an interest in using partitioning techniques in multimedia databases in order to take advantage of the reduction in the number of pages required to answer a query and to minimize data exchange among sites. Nevertheless, until now only vertical and horizontal partitioning have been used in multimedia databases. This paper presents a hybrid partitioning method for multimedia databases. The method takes into account the size of the attributes and the selectivity of the predicates in order to generate hybrid partitioning schemes that reduce the execution cost of queries. A cost model for evaluating hybrid partitioning schemes in distributed multimedia databases is also developed. Finally, experiments on a multimedia database benchmark were performed to demonstrate the efficiency of the proposed method
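    The cost intuition behind hybrid partitioning can be sketched as follows: a hybrid fragment keeps only some attributes (vertical split, driven by attribute size) and only the rows matching a predicate (horizontal split, driven by its selectivity), so a query scans fewer pages. All sizes, selectivities, and the page size below are invented for illustration and are not the paper's cost model.

```python
# Toy page-count estimate for a query against one hybrid fragment.
import math

PAGE_SIZE = 4096          # bytes per disk page (assumed)
ROWS = 100_000            # rows in the unfragmented table (assumed)

attr_sizes = {"id": 8, "title": 120, "thumbnail": 2048}  # bytes per value

def pages_to_scan(attrs, selectivity):
    """Pages read to answer a query over `attrs` on a hybrid fragment
    holding the fraction `selectivity` of the rows."""
    row_bytes = sum(attr_sizes[a] for a in attrs)
    rows = math.ceil(ROWS * selectivity)
    return math.ceil(rows * row_bytes / PAGE_SIZE)

full = pages_to_scan(list(attr_sizes), 1.0)       # unfragmented scan
hybrid = pages_to_scan(["id", "title"], 0.2)      # hybrid fragment
print(f"full scan: {full} pages, hybrid fragment: {hybrid} pages")
```

Large multimedia attributes such as the hypothetical `thumbnail` dominate row size, which is why combining the vertical split with a selective horizontal split pays off most in multimedia databases.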

    Assessing the Impact of a Vinasse Pilot Plant Scale-Up on the Key Processes of the Ethanol Supply Chain

    One of the byproducts generated in cane sugar production is molasses, which is used for ethanol distillation. However, one of the problems of distilleries is vinasse. Vinasse is a highly polluting effluent that is dumped untreated into lakes or rivers, damaging the environment. The company FALA developed a pilot plant that uses vinasse to produce a type of livestock feed called MD60. In this paper, the impact of the pilot plant's scale-up on the key processes of the company's supply chain is analyzed. With the help of a sensitivity analysis, this study finds the values that would allow the company to improve its order fulfillment indicator and increase profits, assuming an expected demand driven by the introduction of this new product into the market. The results show that (1) the pilot plant fulfills 32% of the orders, (2) with the current vinasse storage capacity, it is possible to fulfill up to 77% of the orders by scaling up the pilot plant, (3) to satisfy 100% of the orders, it is necessary to use all the vinasse generated, and (4) the highest profit is reached by processing all the vinasse and by considering the upper sale price